MixMax Approximation as a Super-Gaussian Log-Spectral Amplitude Estimator for Speech Enhancement
Abstract
In single-channel speech enhancement, the noisy observation is most commonly described as the sum of the clean speech signal and the noise signal. In machine-learning-based enhancement schemes that model speech and noise in the log-spectral domain, however, the log-spectrum of the noisy observation can instead be described as the maximum of the speech and noise log-spectra, which simplifies statistical inference. This approximation is referred to as the MixMax model or log-max approximation. In this paper, we show how this approximation can be used in combination with non-trained, blind speech and noise power estimators derived in the spectral domain. Our findings allow the MixMax-based clean speech estimator to be interpreted as a super-Gaussian log-spectral amplitude estimator. This MixMax-based estimator is embedded in a pre-trained speech enhancement scheme and compared to a log-spectral amplitude estimator based on an additive mixing model. Instrumental measures indicate that the MixMax-based estimator causes fewer musical tones while yielding virtually the same quality for the enhanced speech signal.
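To make the log-max idea in the abstract concrete, the following minimal sketch compares the two mixing models in the log-power domain. It is an illustration under stated assumptions, not the estimator from the paper: it assumes that speech and noise powers add (the phase-dependent cross term is ignored), and the function names and the synthetic log-spectra are hypothetical. Under that assumption the gap between the two models equals log(1 + exp(-|s - n|)), so it never exceeds log 2 (about 3 dB in power).

```python
import numpy as np


def log_power_additive(s_log, n_log):
    """Log-power of the mixture if speech and noise powers add.

    Assumption: the cross term between speech and noise is ignored,
    so |Y|^2 = |S|^2 + |N|^2 and log|Y|^2 = log(exp(s) + exp(n)).
    """
    return np.logaddexp(s_log, n_log)


def log_power_mixmax(s_log, n_log):
    """MixMax (log-max) approximation: the larger log-spectrum dominates."""
    return np.maximum(s_log, n_log)


def mixmax_error(s_log, n_log):
    """Gap between the additive-power model and the MixMax approximation."""
    return log_power_additive(s_log, n_log) - log_power_mixmax(s_log, n_log)


# Hypothetical log-power spectra (natural log), used only for illustration.
rng = np.random.default_rng(0)
s_log = rng.normal(0.0, 3.0, size=1000)   # "speech" log-powers
n_log = rng.normal(-1.0, 2.0, size=1000)  # "noise" log-powers

err = mixmax_error(s_log, n_log)
print("largest gap:", err.max(), "upper bound log(2):", np.log(2.0))
```

The gap is largest when speech and noise have equal power in a bin and shrinks quickly as one of them dominates, which is why the maximum is a usable stand-in for the sum in the log-spectral domain.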
Similar resources
Efficient β-order Perceptually Motivated Spectral Amplitude Bayesian Estimator Based On Chi-distribution for Speech Enhancement
The traditional Bayesian estimator of the short-time spectral amplitude is based on minimizing the squared-error cost function under the common Gaussian probability density function (pdf). The Gaussian distribution, however, is not the optimal probability distribution. To address this, we consider replacing the traditional distribution hypothesis for the spectral amplitude of spe...
Speech Enhancement Using a Multidimensional Mixture-Maximum Model
We present a single-microphone speech enhancement algorithm that models the log-spectrum of the noise-free speech signal by a multidimensional Gaussian mixture. The proposed estimator is based on an earlier study which uses the single-dimensional mixture-maximum (MIXMAX) model for the speech signal. The experimental study shows that there is only a marginal difference between the proposed exten...
Speech Enhancement Using Beta-order MMSE Spectral Amplitude Estimator with Laplacian Prior
This report addresses the problem of speech enhancement employing the Minimum Mean-Square Error (MMSE) of β-order Short Time Spectral Amplitude (STSA). We present an analytical solution for β-order MMSE estimator where Discrete Fourier Transform (DFT) coefficients of (clean) speech are modeled by Laplacian distributions. Using some approximations for the joint probability density function and t...
Speech Enhancement by MAP Spectral Amplitude Estimation Using a Super-Gaussian Speech Model
This contribution presents two spectral amplitude estimators for acoustical background noise suppression based on maximum a posteriori estimation and super-Gaussian statistical modelling of the speech DFT amplitudes. The probability density function of the speech spectral amplitude is modelled with a simple parametric function, which allows a high approximation accuracy for Laplace- or Gamma-dist...
Distributed multichannel speech enhancement with minimum mean-square error short-time spectral amplitude, log-spectral amplitude, and spectral phase estimation
In this paper, the authors present optimal multichannel frequency domain estimators for minimum mean-square error (MMSE) short-time spectral amplitude (STSA), log-spectral amplitude (LSA), and spectral phase estimation in a widely distributed microphone configuration. The estimators utilize Rayleigh and Gaussian statistical models for the speech prior and noise likelihood with a diffuse noise f...
Publication date: 2017